Statistical Models for Deep-structure Disambiguation
Authors
Abstract
In this paper, an integrated score function is proposed to resolve deep-structure ambiguity, covering both the cases of constituents and the senses of words. With the integrated score function, different knowledge sources, including part-of-speech, syntax, and semantics, are combined in a uniform formulation. Based on this formulation, different models for case identification and word-sense disambiguation are derived. In the baseline system, the parameter values are estimated with the maximum likelihood estimation method. When the baseline system is tested on a corpus of 800 sentences, accuracy rates of 56.3% for parse tree, 77.5% for case, and 86.2% for word sense are obtained. To reduce the estimation error introduced by maximum likelihood estimation, the Good-Turing smoothing method is then applied. In addition, a robust discriminative learning algorithm is derived to minimize the error rate on the testing set. With these algorithms, accuracy rates of 77% for parse tree, 88.9% for case, and 88.6% for sense are obtained. Compared with the baseline system, this corresponds to error reduction rates of 17.4% for sense discrimination, 50.7% for case identification, and 47.4% for parsing accuracy. These results clearly demonstrate the superiority of the proposed models for deep-structure disambiguation.
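The reported error reduction rates follow directly from the accuracy figures: an error reduction rate is the relative drop in error rate (100% minus accuracy) from the baseline to the improved system. A minimal sketch checking the abstract's numbers (the function name `error_reduction` is illustrative, not from the paper):

```python
def error_reduction(baseline_acc, improved_acc):
    """Relative reduction in error rate; accuracies in percent, result as a fraction."""
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    return (baseline_err - improved_err) / baseline_err

# Accuracy pairs reported in the abstract: baseline -> smoothed + discriminative.
results = {
    "parse tree": (56.3, 77.0),
    "case":       (77.5, 88.9),
    "word sense": (86.2, 88.6),
}

for task, (base, final) in results.items():
    print(f"{task}: {error_reduction(base, final):.1%} error reduction")
# -> parse tree: 47.4%, case: 50.7%, word sense: 17.4%
```

Each computed value matches the abstract to one decimal place, confirming the figures are internally consistent.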
Similar resources
A Statistical Approach towards Unknown Word Type Prediction for Deep Grammars
This paper presents a statistical approach to unknown word type prediction for a deep HPSG grammar. Our motivation is to enhance robustness in deep processing. With a predictor which predicts lexical types for unknown words according to the context, new lexical entries can be generated on the fly. The predictor is a maximum-entropy-based classifier trained on an HPSG treebank. By exploring vario...
Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning
This paper aims to explore the effect of prior disambiguation on neural network-based compositional models, with the hope that better semantic representations for text compounds can be produced. We disambiguate the input word vectors before they are fed into a compositional deep net. A series of evaluations shows the positive effect of prior disambiguation for such deep models.
Transition-Based Parsing for Deep Dependency Structures
Derivations under different grammar formalisms allow extraction of various dependency structures. Particularly, bilexical deep dependency structures beyond surface tree representation can be derived from linguistic analysis grounded by CCG, LFG, and HPSG. Traditionally, these dependency structures are obtained as a by-product of grammar-guided parsers. In this article, we study the alternative ...
Deep Joint Entity Disambiguation with Local Neural Attention
We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations. Key components are entity embeddings, a neural attention mechanism over local context windows, and a differentiable joint inference stage for disambiguation. Our approach thereby combines benefits of deep learning with more traditional approaches such as graphic...
A Morphology-Aware Network for Morphological Disambiguation
Agglutinative languages such as Turkish, Finnish and Hungarian require morphological disambiguation before further processing due to the complex morphology of words. A morphological disambiguator is used to select the correct morphological analysis of a word. Morphological disambiguation is important because it is generally one of the first steps of natural language processing and its performan...
Publication date: 1996